AITopics | image fusion

Conditional Controllable Image Fusion

Neural Information Processing Systems, Feb-18-2026, 08:45:20 GMT

Because samples differ dynamically from one another, our CCF applies fusion constraints specific to each individual sample in practice.

artificial intelligence, image fusion, machine learning, (18 more...)

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.46)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection

Neural Information Processing Systems, Feb-14-2026, 18:35:38 GMT

Multimodal image fusion and object detection are crucial for autonomous driving. While current methods have advanced the fusion of texture details and semantic information, their complex training processes hinder broader applications. Addressing this challenge, we introduce E2E-MFD, a novel end-to-end algorithm for multimodal fusion detection.

information, machine learning, natural language, (18 more...)

Country:

Europe > Switzerland (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.66)
Transportation > Ground > Road (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(2 more...)

Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model

Neural Information Processing Systems, Feb-12-2026, 04:28:11 GMT

Existing multi-modal image fusion methods fail to address the compound degradations present in source images, resulting in fused images plagued by noise, color bias, improper exposure, etc.

artificial intelligence, information fusion, machine learning, (17 more...)

Country: Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

89ef9ce35c7833cba14bb2381ead6c54-Paper-Conference.pdf

Neural Information Processing Systems, Feb-10-2026, 14:24:27 GMT

geoscience and remote sensing, ieee transaction, remote sensing, (11 more...)

Country:

North America > United States (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Singapore (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report (0.93)

Industry: Energy (0.33)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

03dbc11a22e79cd38bea53cf518c2371-Paper-Conference.pdf

Neural Information Processing Systems, Dec-27-2025, 20:58:39 GMT

artificial intelligence, image fusion, machine learning, (19 more...)

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection

Neural Information Processing Systems, Dec-26-2025, 03:47:48 GMT

artificial intelligence, name change, proceedings, (4 more...)

Technology: Information Technology > Artificial Intelligence > Vision (0.88)

Test-Time Dynamic Image Fusion

Neural Information Processing Systems, Dec-23-2025, 17:53:12 GMT

The inherent challenge of image fusion lies in capturing the correlation among multi-source images and comprehensively integrating effective information from the different sources. Most existing techniques do not perform dynamic image fusion and notably lack theoretical guarantees, which creates potential deployment risks in this field. Is it possible to conduct dynamic image fusion with a clear theoretical justification? In this paper, we give our solution from a generalization perspective. We reveal the generalized form of image fusion and derive a new test-time dynamic image fusion paradigm that provably reduces the upper bound of the generalization error.

artificial intelligence, name change, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach

Li, Jiayang, Jiang, Chengjie, Jiang, Junjun, Liang, Pengwei, Ma, Jiayi, Nie, Liqiang

arXiv.org Artificial Intelligence, Dec-9-2025

Image fusion aims to blend complementary information from multiple sensing modalities, yet existing approaches remain limited in robustness, adaptability, and controllability. Most current fusion networks are tailored to specific tasks and lack the ability to flexibly incorporate user intent, especially in complex scenarios involving low-light degradation, color shifts, or exposure imbalance. Moreover, the absence of ground-truth fused images and the small scale of existing datasets make it difficult to train an end-to-end model that simultaneously understands high-level semantics and performs fine-grained multimodal alignment. We therefore present DiTFuse, an instruction-driven Diffusion-Transformer (DiT) framework that performs end-to-end, semantics-aware fusion within a single model. By jointly encoding two images and natural-language instructions in a shared latent space, DiTFuse enables hierarchical and fine-grained control over fusion dynamics, overcoming the limitations of pre-fusion and post-fusion pipelines that struggle to inject high-level semantics. The training phase employs a multi-degradation masked-image modeling strategy, so the network jointly learns cross-modal alignment, modality-invariant restoration, and task-aware feature selection without relying on ground-truth images. A curated, multi-granularity instruction dataset further equips the model with interactive fusion capabilities. DiTFuse unifies infrared-visible, multi-focus, and multi-exposure fusion, as well as text-controlled refinement and downstream tasks, within a single architecture. Experiments on public IVIF, MFF, and MEF benchmarks confirm superior quantitative and qualitative performance, sharper textures, and better semantic retention. The model also supports multi-level user control and zero-shot generalization to other multi-image fusion scenarios, including instruction-conditioned segmentation.

large language model, machine learning, natural language, (19 more...)

2512.0717

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
North America > United States (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.96)
(2 more...)

SGDFuse: SAM-Guided Diffusion for High-Fidelity Infrared and Visible Image Fusion

Zhang, Xiaoyang, Li, Jinjiang, Fan, Guodong, Ju, Yakun, Fan, Linwei, Liu, Jun, Kot, Alex C.

arXiv.org Artificial Intelligence, Nov-25-2025

Infrared and visible image fusion (IVIF) aims to combine the thermal radiation information from infrared images with the rich texture details from visible images to enhance perceptual capabilities for downstream visual tasks. However, existing methods often fail to preserve key targets due to a lack of deep semantic understanding of the scene, while the fusion process itself can also introduce artifacts and detail loss, severely compromising both image quality and task performance. To address these issues, this paper proposes SGDFuse, a conditional diffusion model guided by the Segment Anything Model (SAM), to achieve high-fidelity and semantically-aware image fusion. The core of our method is to utilize high-quality semantic masks generated by SAM as explicit priors to guide the optimization of the fusion process via a conditional diffusion model. Specifically, the framework operates in a two-stage process: it first performs a preliminary fusion of multi-modal features, and then utilizes the semantic masks from SAM jointly with the preliminary fused image as a condition to drive the diffusion model's coarse-to-fine denoising generation. This ensures the fusion process not only has explicit semantic directionality but also guarantees the high fidelity of the final result. Extensive experiments demonstrate that SGDFuse achieves state-of-the-art performance in both subjective and objective evaluations, as well as in its adaptability to downstream tasks, providing a powerful solution to the core challenges in image fusion. The code of SGDFuse is available at https://github.com/boshizhang123/SGDFuse.
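The two-stage flow described in the abstract above (preliminary fusion, then conditioning on that result together with semantic masks) can be illustrated with a minimal NumPy sketch. This is not the SGDFuse implementation: the function names `preliminary_fusion` and `build_condition` are hypothetical, the first stage is replaced by a plain weighted average rather than a learned multi-modal feature fusion, the masks are random placeholders rather than SAM output, and channel-wise concatenation is assumed as the way the condition is assembled. The diffusion model itself is omitted.

```python
import numpy as np

def preliminary_fusion(ir, vis, alpha=0.5):
    """Placeholder first stage: pixel-wise weighted average of two images.

    Assumption: stands in for a learned fusion of multi-modal features.
    """
    return alpha * ir + (1.0 - alpha) * vis

def build_condition(prelim, masks):
    """Stack the preliminary fused image with K binary masks.

    prelim: (H, W) float array; masks: (K, H, W) boolean array.
    Returns a (1 + K, H, W) array that a conditional denoiser could consume.
    """
    masks = masks.astype(prelim.dtype)
    return np.concatenate([prelim[None, ...], masks], axis=0)

rng = np.random.default_rng(0)
ir = rng.random((64, 64))                  # stand-in infrared image
vis = rng.random((64, 64))                 # stand-in visible image (grayscale)
masks = rng.random((3, 64, 64)) > 0.5      # stand-in for segmentation masks

prelim = preliminary_fusion(ir, vis)
condition = build_condition(prelim, masks)
print(condition.shape)  # (4, 64, 64)
```

In a real pipeline the `condition` array would be passed to the denoising network at every step of the coarse-to-fine generation; how that is wired is specific to the paper's code, which is linked in the abstract.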

artificial intelligence, image understanding, machine learning, (19 more...)

2508.05264

Country:

Europe > United Kingdom > England > Leicestershire > Leicester (0.04)
Asia > China > Shandong Province > Yantai (0.04)
Asia > China > Fujian Province > Fuzhou (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SWIR-LightFusion: Multi-spectral Semantic Fusion of Synthetic SWIR with Thermal IR (LWIR/MWIR) and RGB

Hussain, Muhammad Ishfaq, Van Linh, Ma, Naz, Zubia, Fatima, Unse, Ko, Yeongmin, Jeon, Moongu

arXiv.org Artificial Intelligence, Nov-25-2025

Enhancing scene understanding in adverse visibility conditions remains a critical challenge for surveillance and autonomous navigation systems. Conventional imaging modalities, such as RGB and thermal infrared (MWIR/LWIR), often struggle even when fused to deliver comprehensive scene information, particularly under atmospheric interference or inadequate illumination. To address these limitations, Short-Wave Infrared (SWIR) imaging has emerged as a promising modality due to its ability to penetrate atmospheric disturbances and differentiate materials with improved clarity. However, the advancement and widespread implementation of SWIR-based systems face significant hurdles, primarily due to the scarcity of publicly accessible SWIR datasets. In response to this challenge, our research introduces an approach that synthetically generates images carrying SWIR-like structural/contrast cues (without claiming spectral reproduction) from existing LWIR data using advanced contrast enhancement techniques. We then propose a multimodal fusion framework integrating synthetic SWIR, LWIR, and RGB modalities, employing an optimized encoder-decoder neural network architecture with modality-specific encoders and a softmax-gated fusion head. Comprehensive experiments on public RGB-LWIR benchmarks (M3FD, TNO, CAMEL, MSRS, RoadScene) and an additional private real RGB-MWIR-SWIR dataset demonstrate that our synthetic-SWIR-enhanced fusion framework improves fused-image quality (contrast, edge definition, structural fidelity) while maintaining real-time performance. We also add fair trimodal baselines (LP, LatLRR, GFF) and cascaded trimodal variants of U2Fusion/SwinFusion under a unified protocol. The outcomes highlight substantial potential for real-world applications in surveillance and autonomous systems.
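The "softmax-gated fusion head" mentioned in the abstract above can be illustrated with a small NumPy sketch. It shows the general idea of softmax gating only, under stated assumptions: the function name `softmax_gated_fusion` is hypothetical, the gate logits are random placeholders rather than the output of a learned layer, and per-pixel gating over three modality feature maps is assumed. The paper's actual head may differ.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def softmax_gated_fusion(features, gate_logits):
    """Fuse M modality feature maps with per-pixel softmax weights.

    features:    (M, C, H, W) encoder outputs, one per modality.
    gate_logits: (M, H, W) unnormalized gate scores (assumed to come
                 from a learned layer; random here).
    Returns the fused (C, H, W) map and the (M, H, W) weights.
    """
    weights = softmax(gate_logits, axis=0)            # sums to 1 over modalities
    fused = (weights[:, None, :, :] * features).sum(axis=0)
    return fused, weights

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 8, 4, 4))   # e.g. synthetic SWIR, LWIR, RGB features
gate_logits = rng.normal(size=(3, 4, 4))
fused, weights = softmax_gated_fusion(features, gate_logits)
print(fused.shape)  # (8, 4, 4)
```

Because the weights sum to one at every pixel, the fused map is a convex combination of the modality features, so equal logits reduce the head to a plain average.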

artificial intelligence, information fusion, machine learning, (16 more...)

doi: 10.1007/s10586-025-05792-1

2510.13404

Country: